explore(agent-wiki): trajectory-derived wiki — skills, builder, experiments #268

Merged
vinodmut merged 9 commits into AgentToolkit:main from vinodmut:explorations/agent-wiki
Jun 10, 2026

@vinodmut vinodmut commented Jun 10, 2026

Contributor

Related to #256 — this is a prototype of offline trajectory-mining + consolidation ("dreaming"): reviewing saved trajectories to extract, consolidate, deduplicate, and curate memory outside the main task loop, with an auditable record of what changed.

What this is

An exploration in turning agent trajectories into a reusable, evidence-grounded wiki that future agents consult before acting — plus the experiments measuring whether it helps. Everything lives self-contained under explorations/agent-wiki/.

The core idea: after an agent finishes a task, distill its trajectory into wiki pages — episodic summaries, atomic guidelines, themed cluster pages, and executable skills — each linked back to the trajectory that produced it. A future agent, pointed at the wiki's AGENTS.md, retrieves the pages relevant to its task and applies them instead of re-deriving the recipe.

How this maps to #256 ("dreaming")

| #256 asks for | provided here |
| --- | --- |
| extract useful memories from raw trajectories after the fact | agent-wiki-summarize / -extract-guidelines / -synthesize-skill (retroactive + batch ingest) |
| consolidate duplicate / overlapping guidelines | agent-wiki-consolidate-guidelines → cluster pages |
| promote repeated observations; detect stale / redundant entities | delete-on-promote (--archive-covered), recall roll-up, priority tiers |
| auditable summary of what changed and why | _audit.log + provenance back-links on every page |

Layout

explorations/agent-wiki/
├── skills/        7 agent-wiki skills + build_agent_wiki.py (reference copy)
├── docs/          design.md (rationale) + schema.md (on-disk format)
├── experiments/   RESULTS-SUMMARY + comparison reports; metrics/ rollups; harness/ scripts
└── wikis/         worked examples: wiki-twobatch {base, skills, both, pruned}

Headline findings (experiments/RESULTS-SUMMARY.md)

  • Wiki vs no wiki: −20% cost, −38% duration, −43% tool calls at unchanged accuracy (16-task A/B).
  • Skills > guidelines: a skills-only wiki beats a guidelines-only one on cost (−14%) and matches accuracy.
  • Pointer wording is load-bearing: a strong-imperative CLAUDE.md pointer is read 3/3; a soft one 1/3.
  • Composition > size: piling guidelines on top of skills is the worst populated wiki; delete-on-promote (archive skill-covered atomics) beats it but skills-only stays cheapest.

Scope / data note

These are benchmark-derived example wikis (a synthetic 16-task file-format corpus). Raw per-trial sandbox transcripts and any wikis built from internal trajectory corpora are intentionally excluded — only metric rollups, narrative reports, and the benchmark-derived wikis are included. Source links in wiki frontmatter are shown in the generic trajectories/<session-id>.json form. The skills are a standalone reference copy, not wired into a plugin loader.

Summary by CodeRabbit

  • New Features

    • Added an agent-wiki exploration with CLI experiment and analysis tooling to run, normalize, score, and compare wiki-consult experiments and render summary reports.
  • Documentation

    • Large collection of design, schema, skill, and experiment write-ups describing wiki formats, ingestion/synthesis/consolidation workflows, experiment results, and usage guides.
  • Chores

    • Excluded generated example wiki content from secret scanning and lint/type checks; updated secret-scan baseline and configs.

vinodmut added 2 commits June 10, 2026 00:54
Adds explorations/agent-wiki/ — the agent-wiki skill family, builder, design
+ schema docs, the wiki-helps experiment reports, and benchmark-derived
example wikis, all under one tree suitable for a public PR.

Contents:
  - skills/        7 agent-wiki skills + build_agent_wiki.py (reference copy,
                   not plugin-wired)
  - docs/          design.md + schema.md
  - experiments/   RESULTS-SUMMARY + twobatch comparison reports +
                   pruned-index-hypothesis; metrics/ rollups (no raw
                   transcripts); harness/ runner + compare scripts
  - wikis/         wiki-terminalbench-bob + the twobatch arms
                   (base / skills / both / pruned-corrected)

Public-safety scrub:
  - Excluded all raw per-trial sandbox transcripts (kept only metric
    rollups + narrative reports).
  - Excluded wikis built from internal corpora (procedural-design,
    consult-meta, iterative, retroactive, simple-claude, test-paired,
    claude) and the build-pattern comparison that ran on them; §3-4 of
    RESULTS-SUMMARY reduced to a portable-finding note.
  - Rewrote all source-path frontmatter to the generic
    trajectories/<session-id>.json form; genericized internal example
    names and the benchmark-data dir convention in skills/docs.
  - Leak gate (benchmark-data / internal corpus + wiki names / org paths)
    passes with zero hits across the tree.

Branched off main; diff touches only explorations/agent-wiki/. Builder
catalog + comparison scripts verified runnable from the new location.

Removes the terminal-bench example wiki from the exploration. Repoints the
README reading-order + layout to wiki-twobatch-skills, fixes the docs that
attributed worked examples to it (schema.md now points at the wiki-twobatch
arms; example index rows retagged), and corrects stale relative links the
docs carried from the original tree (../plugin-source → ../skills,
../WIKIS.md removed, ../experiments/wiki-build-comparison.md → RESULTS-SUMMARY
§3–4, design.md/schema.md cross-links to renamed filenames). Skill example
paths (consult, ingest) repointed off the removed wiki.

Remaining wikis: wiki-twobatch {base, skills, both, pruned}. All intra-doc
relative links resolve; leak gate clean.
@coderabbitai

coderabbitai Bot commented Jun 10, 2026


Warning

Review limit reached

@vinodmut, we couldn't start this review because you've reached your PR review rate limit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c6892d64-bc1e-48c5-a3bf-2141cc315c86

📥 Commits

Reviewing files that changed from the base of the PR and between 26c2884 and 7a1f5ac.

📒 Files selected for processing (3)
  • .pre-commit-config.yaml
  • .secrets.baseline
  • explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py
📝 Walkthrough


Adds an agent-wiki exploration: README, design and schema docs, skills/templates and default configs, a Dockerized experiment harness and 17-task suite, transcript normalization and metric-extraction tooling, multi-arm comparison scripts and reports, populated JSONL metric datasets, and repo config updates to exclude generated wikis from scans and linters.

Changes

Agent-Wiki Framework Design & Schema

| Layer | File(s) | Summary |
| --- | --- | --- |
| Framework README | `explorations/agent-wiki/README.md` | Repository-level README introducing the agent-wiki concept, layout, reading order, and scope constraints. |
| Design principles & pipeline | `explorations/agent-wiki/docs/design.md` | Design doc specifying provenance, page kinds, lifecycle rules (consolidate/delete-on-promote), pipeline ordering, build patterns, and experimental evidence summary. |
| On-disk schema & contracts | `explorations/agent-wiki/docs/schema.md` | Schema reference for page kinds, YAML frontmatter, index/config/audit artifacts, linking rules, promotion/archival mechanics, and worked examples. |

Experimental Validation & Result Analysis

| Layer | File(s) | Summary |
| --- | --- | --- |
| Experiment harness & task suite | `explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py`, `explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml` | CLI harness that creates per-trial workspaces, runs claude-sandbox sessions with condition-specific setup, parses stream-json to detect AGENTS.md/guideline reads and assistant text, scores signals, and writes runs/transcripts/summary; includes 17 prompt-driven tasks. |
| Transcript normalization & metrics extraction | `explorations/agent-wiki/experiments/harness/normalize_stream_json_transcripts.py`, `explorations/agent-wiki/experiments/harness/extract_trial_metrics.py` | Normalize stream-json transcripts to an OpenAI-chat format and extract per-trial metrics (token counts, tool calls, wiki/index/guideline reads, durations, costs, outcome matching). |
| Comparison & reporting tools | `explorations/agent-wiki/experiments/harness/twobatch_compare.py`, `.../threeway_compare.py`, `.../fourway_compare.py`, `.../fiveway_compare.py` | Scripts that load JSONL metric files, group by task/arm, compute medians/accuracies/deltas, and render Markdown comparison reports (aggregate, per-family, per-task). |
| Experiment reports & metrics | `explorations/agent-wiki/experiments/RESULTS-SUMMARY.md`, `explorations/agent-wiki/experiments/*-comparison.md`, `explorations/agent-wiki/experiments/metrics/*.metrics.jsonl`, `explorations/agent-wiki/experiments/pruned-index-hypothesis.md` | Comprehensive experiment writeups and JSONL metric datasets (48–95 records per file) used for analysis and comparisons. |
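
The grouping-and-delta step that the comparison scripts are described as performing can be sketched roughly as follows. This is a minimal illustration, not the harness code: the record fields (`task`, `arm`, `cost_usd`, `correct`), the arm names, and the function names are all assumptions made up for the example.

```python
import json
import statistics
from collections import defaultdict


def load_records(lines):
    """Parse JSONL text lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize_by_arm(records, metric="cost_usd"):
    """Return {arm: {"median": ..., "accuracy": ...}} over all records."""
    by_arm = defaultdict(list)
    for rec in records:
        by_arm[rec["arm"]].append(rec)
    summary = {}
    for arm, recs in by_arm.items():
        values = [r[metric] for r in recs]
        summary[arm] = {
            "median": statistics.median(values),
            "accuracy": sum(1 for r in recs if r["correct"]) / len(recs),
        }
    return summary


def pct_delta(baseline, treatment):
    """Relative change of treatment versus baseline, in percent."""
    return (treatment - baseline) / baseline * 100.0


# Hypothetical sample rows; field names are assumptions, not the real schema.
sample = [
    '{"task": "t1", "arm": "no-wiki", "cost_usd": 1.0, "correct": true}',
    '{"task": "t2", "arm": "no-wiki", "cost_usd": 2.0, "correct": true}',
    '{"task": "t1", "arm": "wiki", "cost_usd": 0.8, "correct": true}',
    '{"task": "t2", "arm": "wiki", "cost_usd": 1.6, "correct": true}',
]
summary = summarize_by_arm(load_records(sample))
delta = pct_delta(summary["no-wiki"]["median"], summary["wiki"]["median"])
print(f"median cost delta: {delta:+.0f}%")
```

Reporting a median per arm and a percent delta against the baseline arm keeps a single outlier trial from dominating the comparison, which is presumably why the reports quote medians rather than means.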

Operational Skills & Templates

| Layer | File(s) | Summary |
| --- | --- | --- |
| AGENTS.md template & default config | `explorations/agent-wiki/skills/scripts/_default_agents.md`, `explorations/agent-wiki/skills/scripts/_default_agent_wiki_config.yaml` | AGENTS.md template and default YAML config describing consult contract, file layouts, tags/clusters/tasks, and examples. |
| Consult skill & retrieval contract | `explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md` | Consult-skill docs describing wiki root resolution, reading AGENTS.md and _index.jsonl, applying retrieval recipes, and surfacing ranked candidate matches. |
| Summarize / Extract / Synthesize / Ingest | `explorations/agent-wiki/skills/*/SKILL.md` | Skills documentation covering per-trace summarization, guideline extraction (entities JSON schema), skill synthesis (skill JSON schema and render behavior), consolidation, task comparisons, and full ingest orchestration with step ordering and best-practices. |

Repository Tooling & Configuration

| Layer | File(s) | Summary |
| --- | --- | --- |
| Repo config updates | `.pre-commit-config.yaml`, `.secrets.baseline`, `pyproject.toml` | Exclude explorations/agent-wiki/ from detect-secrets scanning (with comments), update .secrets.baseline exclude.files and a recorded sandbox README entry, and extend Ruff/MyPy excludes for generated wikis in pyproject.toml. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • visahak
  • illeatmyhat
  • gaodan-fang

Poem

🐰 I nibble through traces, stitch pages with care,
Little rules and skills bloom from trials laid bare.
From runs and metrics, a tidy guide I spin—
One rabbit's hops turn many agents' win. 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 17.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'explore(agent-wiki): trajectory-derived wiki — skills, builder, experiments' directly and comprehensively summarizes the main addition: an exploration of an agent-wiki system with skills, builder, and experiments. It is clear, specific, and accurately reflects the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@vinodmut vinodmut marked this pull request as ready for review June 10, 2026 06:01
CI (ruff, mypy, detect-secrets) was scanning explorations/agent-wiki/ as
project source — the first content under explorations/ to carry .py files
and high-entropy identifiers. Fixes, scoped so generated example artifacts
are treated like the already-excluded plugin-source/ and examples/ trees:

- ruff: lint + format fixes in the harness scripts + builder; exclude the
  generated wiki scripts (explorations/agent-wiki/wikis/) via extend-exclude.
- mypy: add explorations/agent-wiki/wikis/ to exclude; add file-local
  `# mypy: ignore-errors` to the exploration harness + the builder (a
  verbatim copy of the mypy-excluded plugin-source/ original).
- detect-secrets: exclude explorations/agent-wiki/ in the pre-commit hook
  and .secrets.baseline — the 53 findings are 12-hex guideline content
  hashes and session-id UUIDs, not secrets.

No example-wiki content changed (scripts keep their original names).
Fixes failing CI checks: check-formatting, check-linting, check-typing,
tekton/pr-code-checks/code-detect-secrets.

Drops explorations/agent-wiki/wikis/ (253 generated files, ~10k lines) from
this PR so the diff is the reviewable surface — skills, builder, docs, and
the experiment reports/harness (~34 files). The example wikis are machine-
generated output; bundling them buried the code and appears to have made
CodeRabbit skip deep review (summary only, zero inline findings).

The wikis land in a stacked follow-up PR. README/docs still reference
wikis/wiki-twobatch-* by path; those links resolve once the follow-up
merges. Root-config excludes (ruff/mypy/detect-secrets) are kept — the
detect-secrets exclude still covers example content hashes in docs/schema.md,
and the wiki excludes become live again when the follow-up lands.
@vinodmut

Contributor Author

Split the generated example wikis into a companion PR #269 (merge after this one) so this diff stays focused on the reviewable code — builder, skills, docs, and experiment harness (34 files vs the original 287). This should let CodeRabbit review the code properly.

@vinodmut

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (13)
explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md (1)

205-213: ⚡ Quick win

Add language specifier to fenced code block.

The directory structure example should use text or similar language identifier for consistency.

📝 Suggested fix
-```
+```text
 <wiki>/skills/
 ├── _id_index.json                     skill slug → relpath
 ├── index.md                           alphabetical listing (auto-generated)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md` around
lines 205 - 213, Update the fenced code block in SKILL.md that shows the
directory tree for "<wiki>/skills/" to include a language specifier (e.g.,
change the opening ``` to ```text) so the block is marked as plain text; locate
the block in the SKILL.md content that begins with the three backticks followed
by the tree and replace the opening fence accordingly to ensure consistent
formatting.

Source: Linters/SAST tools

explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md (2)

53-55: ⚡ Quick win

Add language specifier to fenced code block.

The code block should specify bash as the language for proper syntax highlighting and consistency with the rest of the documentation.

📝 Suggested fix
-```
+```bash
 Read <wiki-root>/AGENTS.md
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md` around lines 53 -
55, Update the fenced code block containing `Read <wiki-root>/AGENTS.md` in
SKILL.md to include a language specifier; specifically, change the opening fence
from a bare triple backtick to one tagged `bash`, which enables Bash syntax
highlighting and keeps formatting consistent with other docs.


</details>

<!-- cr-comment:v1:112ba0be983257e2722014cb -->

_Source: Linters/SAST tools_

---

`72-74`: _⚡ Quick win_

**Add language specifier to fenced code block.**

The code block should specify `bash` as the language for proper syntax highlighting and consistency.





<details>
<summary>📝 Suggested fix</summary>

````diff
-```
+```bash
 Read <wiki-root>/_index.jsonl
 ```
````
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md` around lines 72 -
74, Add the language specifier "bash" to the fenced code block that contains the
line `Read <wiki-root>/_index.jsonl` in SKILL.md so the block reads as a bash
snippet; locate the triple-backtick fence surrounding that line and change the
opening fence from a bare triple backtick to one tagged `bash` to enable proper
syntax highlighting and consistency.


</details>

<!-- cr-comment:v1:a82435713f4f0bdb75db895c -->

_Source: Linters/SAST tools_

</blockquote></details>
<details>
<summary>explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md (1)</summary><blockquote>

`43-45`: _⚡ Quick win_

**Add language specifier to fenced code block.**

The code block should specify `bash` as the language for consistency.





<details>
<summary>📝 Suggested fix</summary>

````diff
-```
+```bash
 Read /tmp/summaries.json
 ```
````
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md` around lines 43 -
45, The fenced code block containing the line "Read /tmp/summaries.json" is
missing a language specifier; update the markdown in SKILL.md by changing the
opening fence from a bare triple backtick to one tagged `bash` so the block reads
as a bash code block (keep the closing fence unchanged), ensuring consistency
with other fenced blocks.


</details>

<!-- cr-comment:v1:8cf4b500a74a459a81dea92b -->

_Source: Linters/SAST tools_

</blockquote></details>
<details>
<summary>explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md (1)</summary><blockquote>

`43-45`: _⚡ Quick win_

**Add language specifier to fenced code block.**

The code block should specify `bash` as the language for consistency.





<details>
<summary>📝 Suggested fix</summary>

````diff
-```
+```bash
 Read /tmp/guidelines.json
 ```
````
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md`
around lines 43 - 45, The fenced code block containing the line "Read
/tmp/guidelines.json" in SKILL.md should include a language specifier; update
the opening fence from a bare triple backtick to one tagged `bash` so the
block reads as a bash snippet for consistency and proper syntax highlighting.


</details>

<!-- cr-comment:v1:5bf51a6b4180929ec5599bc9 -->

_Source: Linters/SAST tools_

</blockquote></details>
<details>
<summary>explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md (1)</summary><blockquote>

`26-35`: _⚡ Quick win_

**Add language specifier to fenced code block.**

The pipeline diagram should use `text` or similar language identifier for consistency.





<details>
<summary>📝 Suggested fix</summary>

```diff
-```
+```text
 0.  Convert    raw bob / claude traces → normalized analysis JSON   (skip if already normalized)
 1.  Bootstrap  create wiki scaffold + seed catalog                  (skip if wiki exists)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md` around lines 26 -
35, The fenced code block that lists the pipeline steps (starting with "0.
Convert raw bob / claude traces → normalized analysis JSON") in SKILL.md is
missing a language specifier; update the opening fence from a bare triple
backtick to one tagged `text` (or another plain-text identifier) so the block
preserves formatting/consistency across renderers.


</details>

<!-- cr-comment:v1:6330421252c87cc7389c0754 -->

_Source: Linters/SAST tools_

</blockquote></details>
<details>
<summary>explorations/agent-wiki/skills/scripts/_default_agents.md (3)</summary><blockquote>

`35-55`: _💤 Low value_

**Add language specifier to fenced code block.**

The directory structure code block should specify a language (e.g., `text`) to satisfy markdown linting best practices.





<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 <wiki-root>/
 ├── AGENTS.md          ← this file
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/scripts/_default_agents.md` around lines 35 -
55, The fenced directory-structure block at the top of AGENTS.md is missing a
language tag; update the opening fence from a bare triple backtick to one tagged
`text` so the block is marked as plain text (modify the code block that begins
with the tree under `<wiki-root>/` in AGENTS.md).


</details>

<!-- cr-comment:v1:f6308da14e8d4796aead671d -->

_Source: Linters/SAST tools_

---

`123-131`: _💤 Low value_

**Add language specifier to fenced code block.**

The provenance chain code block should specify a language (e.g., `text`) to satisfy markdown linting best practices.





<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 guideline.md
   ↓ frontmatter `related_summary:`
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/scripts/_default_agents.md` around lines 123 -
131, The fenced provenance chain block in _default_agents.md lacks a language
specifier which fails markdown linting; update the opening triple-backtick for
that block to include a language (for example, tag it `text`) so the
block becomes a labeled text code fence, ensuring the block around the lines
starting with "guideline.md" and the following arrows is annotated (look for the
triple-backtick that opens that specific provenance chain block).


</details>

<!-- cr-comment:v1:2ab25b3aa9b4a9092e5a0af8 -->

_Source: Linters/SAST tools_

---

`151-152`: _⚡ Quick win_

**Clarify the guideline reference format in the example.**

Line 152 shows `474bb2ba1076` as a guideline reference, but according to the structure at line 46, atomic guideline files follow the pattern `<slug>__<gid>.md`. The example should either show the full filename or clarify that this is just the ID portion from the `_index.jsonl`.
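
The mapping between a bare ID and a full filename can be sketched as below. This is an illustration under stated assumptions: the `<slug>__<gid>.md` pattern comes from the comment above, the 12-character hex form of the ID is inferred from the example ID, and the helper names and the `promote-mode-to-flag` slug are hypothetical.

```python
import re

# Assumed convention: atomic guideline files are named <slug>__<gid>.md,
# where <gid> is a 12-character lowercase hex id (inferred, not confirmed).
GUIDELINE_FILE_RE = re.compile(r"^(?P<slug>.+)__(?P<gid>[0-9a-f]{12})\.md$")


def guideline_filename(slug: str, gid: str) -> str:
    """Build the filename for an atomic guideline from its slug and id."""
    return f"{slug}__{gid}.md"


def parse_guideline_filename(name: str) -> tuple[str, str]:
    """Split a guideline filename back into (slug, gid)."""
    match = GUIDELINE_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not an atomic guideline filename: {name!r}")
    return match["slug"], match["gid"]


# Hypothetical slug paired with the id quoted in the review comment.
name = guideline_filename("promote-mode-to-flag", "474bb2ba1076")
slug, gid = parse_guideline_filename(name)
```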





<details>
<summary>📝 Suggested clarification</summary>

```diff
-   - `474bb2ba1076` "Promote a feature mode to a top-level flag, not an
-     option" (atomic; tags include `cli`, `ux`, `workspace`).
+   - `474bb2ba1076` (id) "Promote a feature mode to a top-level flag, not an
+     option" (atomic; tags include `cli`, `ux`, `workspace`; 
+     file: `guidelines/promote-mode-to-flag__474bb2ba1076.md`).
```

Or simply add a note that IDs shown are from `_index.jsonl` and map to full filenames via the pattern.
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/skills/scripts/_default_agents.md` around lines 151 -
152, Update the example in _default_agents.md so the guideline reference is
unambiguous: replace or augment the bare ID `474bb2ba1076` with the full atomic
filename pattern `<slug>__474bb2ba1076.md` (or show the actual example full
filename), or add a short note stating that IDs shown (like `474bb2ba1076`) are
the gid field from `_index.jsonl` and map to atomic files via the
`<slug>__<gid>.md` convention; ensure this change is applied where the example
appears so readers can map the ID to the full guideline filename.
```

</details>

<!-- cr-comment:v1:6be9e0d1da5cd4915083e5de -->

</blockquote></details>
<details>
<summary>explorations/agent-wiki/experiments/harness/twobatch_compare.py (2)</summary><blockquote>

`232-249`: _💤 Low value_

**Move `TASK_IDS_ORDER` constant to module top for conventional readability.**

The constant is defined at line 232 but used in `main()` at line 192. This works because Python resolves global names when `main()` is called rather than when it is defined, but conventional practice places module-level constants immediately after imports. This improves readability and makes it clear which values are configurable.
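
The late definition works because Python looks up global names when a function body runs, not when the function is defined. A minimal sketch, with `TASK_ORDER` and its values as placeholders rather than the script's actual `TASK_IDS_ORDER` contents:

```python
def main() -> list[str]:
    # TASK_ORDER is looked up when main() runs, not when it is defined.
    return list(TASK_ORDER)


# Defined after main(), but before main() is called, so the lookup succeeds.
TASK_ORDER = ["task-a", "task-b"]

result = main()
```

Calling `main()` before the assignment would raise `NameError`, which is the hazard that placing constants at the top of the module avoids.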





<details>
<summary>Suggested placement</summary>

Move lines 232-249 to appear immediately after line 21 (after `REPO` definition and before `FAMILY`).

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/experiments/harness/twobatch_compare.py` around lines
232 - 249, TASK_IDS_ORDER is defined far down the file but used in main(); move
the TASK_IDS_ORDER constant up to the top-level constants area (immediately
after the existing REPO constant and before FAMILY) so it’s declared with other
module-level configuration. Update its placement only — keep the exact name
TASK_IDS_ORDER and do not change its contents or usages (e.g., references inside
main()) to restore conventional readability.
```

</details>

<!-- cr-comment:v1:f6a2be203b773f2ca558acac -->

---

`23-79`: _⚡ Quick win_

**Consider extracting shared constants and utilities to reduce duplication.**

Both `twobatch_compare.py` and `threeway_compare.py` duplicate `FAMILY` dict, `TASK_IDS_ORDER` list, and helper functions (`median`/`mean`, `fmt`, `delta`/`delta_str`, `acc`). For exploration code, self-contained scripts may be intentional, but if these tools will be maintained or extended, consolidating ~80 lines of shared logic into a `comparison_utils.py` module would reduce drift and simplify updates.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/experiments/harness/twobatch_compare.py` around lines
23 - 79, Extract the duplicated constants and helper functions into a new module
(e.g., comparison_utils.py): move FAMILY and TASK_IDS_ORDER plus median_or_none,
mean_or_none, fmt, delta_str (and any other shared helpers like acc/delta) into
that module, update twobatch_compare.py and threeway_compare.py to import those
symbols instead of redefining them, and remove the duplicate definitions from
both files so they reference the shared implementations.
```

</details>

<!-- cr-comment:v1:5a2f1f797deec619cfc823db -->

</blockquote></details>
<details>
<summary>explorations/agent-wiki/experiments/harness/fourway_compare.py (1)</summary><blockquote>

`20-20`: _⚡ Quick win_

**Remove unused `REPO` variable in both comparison scripts.**

Both `fourway_compare.py` and `fiveway_compare.py` define `REPO = Path(__file__).resolve().parents[1]` but never reference it. This appears to be copy-paste boilerplate that can be removed.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/experiments/harness/fourway_compare.py` at line 20,
Remove the unused REPO variable declaration (REPO =
Path(__file__).resolve().parents[1]) from the comparison scripts; locate the
top-level REPO assignment in fourway_compare.py and fiveway_compare.py and
delete that line to eliminate dead/copy-pasted boilerplate, ensuring no other
code references REPO afterward.
```

</details>

<!-- cr-comment:v1:82bef0e5f751d47f51a6fc55 -->

</blockquote></details>
<details>
<summary>explorations/agent-wiki/experiments/RESULTS-SUMMARY.md (1)</summary><blockquote>

`408-432`: _💤 Low value_

**Add language identifier to fenced code block.**

The file-map code fence should specify `text` or leave it blank explicitly. As per static analysis, fenced code blocks should have a language specified.





<details>
<summary>🔧 Proposed fix</summary>

````diff
-```
+```text
 explorations/agent-wiki/experiments/
 ├── RESULTS-SUMMARY.md                     this file
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md` around lines 408 -
432, The fenced directory tree at the top of RESULTS-SUMMARY.md is missing a
language identifier; update the opening code fence for the file-map block (the
triple-backticks that surround the explorations/agent-wiki/experiments/ tree) to
include a language identifier (e.g., change ``` to ```text) so static analysis
recognizes the block language.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py`:
- Line 47: REPO_ROOT is set using Path(__file__).resolve().parents[2], which
points to explorations/agent-wiki/ (too shallow); change the parent index to
parents[4] so REPO_ROOT points to the repository root. Update the expression
Path(__file__).resolve().parents[2] to Path(__file__).resolve().parents[4]
(symbol: REPO_ROOT) so every path derived from REPO_ROOT later in the script
resolves correctly.
- Around line 425-426: The median calculation using median = durs[n // 2] is
incorrect for even-length lists; update the logic after computing durs and n so
that you compute mid = n // 2 and set median = durs[mid] when n is odd,
otherwise set median = (durs[mid - 1] + durs[mid]) / 2.0 to return the average
of the two middle values (ensure the result is a float); reference the existing
variables durs, n, median and rows to locate and replace the incorrect line.
- Line 310: The tasks_file path is pointing to the wrong location; update the
tasks_file assignment (the variable named tasks_file which currently uses
REPO_ROOT / "tests" / "e2e" / "wiki_consult_tasks.yaml") to the actual file
location by constructing the path as REPO_ROOT / "explorations" / "agent-wiki" /
"experiments" / "harness" / "wiki_consult_tasks.yaml" so the script can load the
correct YAML; ensure you use the same Path joining style used elsewhere with
REPO_ROOT.
- Around line 114-120: The helper _seed_format_group currently does a local
import "from _format_samples import seed_into" but there is no _format_samples
module in the harness directory, causing runtime import errors; fix by providing
a real implementation or correct import: either add a new _format_samples.py
next to experiment_wiki_consult.py that implements seed_into(ws: Path, group:
str) -> list[str], or change the import in _seed_format_group to point to the
existing module that provides seed_into (or inline the seed_into logic into
_seed_format_group) so that the function _seed_format_group calls a valid
seed_into symbol at runtime.

In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md`:
- Line 368: Replace the typo "byes" with "bytes" in the sentence fragment that
currently reads "it reads MORE byes (cache-creation..." in RESULTS-SUMMARY.md so
the sentence reads "it reads MORE bytes (cache-creation..."; search for the
exact phrase "it reads MORE byes" to locate the spot and update only that word.

In `@explorations/agent-wiki/skills/scripts/_default_agents.md`:
- Around line 167-171: Update the bootstrap command path in the markdown so it
points to the correct script location: replace occurrences of
"plugin-source/skills/agent-wiki/scripts/build_agent_wiki.py" with
"explorations/agent-wiki/skills/scripts/build_agent_wiki.py" in the AGENTS.md
bootstrap instructions (the sentence containing the command `uv run python ...
build_agent_wiki.py --wiki-root <wiki-root> catalog`) so the documented command
matches the actual repository layout.
- Around line 28-29: The Structure section in _default_agents.md is missing the
generated guidelines/index.md entry; update the Structure listing to include
guidelines/index.md alongside the existing guidelines/<slug>__<gid>.md and
guidelines/<slug>__cluster.md entries so the docs match what
explorations/agent-wiki/skills/scripts/build_agent_wiki.py generates; edit the
"Structure" block in _default_agents.md to add a line for guidelines/index.md
and ensure formatting/ordering is consistent with the other guideline entries.

---

Nitpick comments:
In `@explorations/agent-wiki/experiments/harness/fourway_compare.py`:
- Line 20: Remove the unused REPO variable declaration (REPO =
Path(__file__).resolve().parents[1]) from the comparison scripts; locate the
top-level REPO assignment in fourway_compare.py and fiveway_compare.py and
delete that line to eliminate dead/copy-pasted boilerplate, ensuring no other
code references REPO afterward.

In `@explorations/agent-wiki/experiments/harness/twobatch_compare.py`:
- Around line 232-249: TASK_IDS_ORDER is defined far down the file but used in
main(); move the TASK_IDS_ORDER constant up to the top-level constants area
(immediately after the existing REPO constant and before FAMILY) so it’s
declared with other module-level configuration. Update its placement only — keep
the exact name TASK_IDS_ORDER and do not change its contents or usages (e.g.,
references inside main()) to restore conventional readability.
- Around line 23-79: Extract the duplicated constants and helper functions into
a new module (e.g., comparison_utils.py): move FAMILY and TASK_IDS_ORDER plus
median_or_none, mean_or_none, fmt, delta_str (and any other shared helpers like
acc/delta) into that module, update twobatch_compare.py and threeway_compare.py
to import those symbols instead of redefining them, and remove the duplicate
definitions from both files so they reference the shared implementations.

In `@explorations/agent-wiki/experiments/RESULTS-SUMMARY.md`:
- Around line 408-432: The fenced directory tree at the top of
RESULTS-SUMMARY.md is missing a language identifier; update the opening code
fence for the file-map block (the triple-backticks that surround the
explorations/agent-wiki/experiments/ tree) to include a language identifier
(e.g., change ``` to ```text) so static analysis recognizes the block language.

In `@explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md`:
- Around line 43-45: The fenced code block containing the line "Read
/tmp/guidelines.json" in SKILL.md should include a language specifier; update
that code fence to use "bash" (i.e., change the opening ``` to ```bash) so the
block reads as a bash snippet for consistency and proper syntax highlighting.

In `@explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md`:
- Around line 53-55: Update the fenced code block containing "Read
<wiki-root>/AGENTS.md" in SKILL.md to include a language specifier;
specifically, change the backticks that start the block to ```bash so the
snippet is ```bash Read <wiki-root>/AGENTS.md ``` which enables Bash syntax
highlighting and keeps formatting consistent with other docs.
- Around line 72-74: Add the language specifier "bash" to the fenced code block
that contains the line "Read <wiki-root>/_index.jsonl" in SKILL.md so the block
reads as a bash snippet; locate the triple-backtick fence surrounding that line
and change it from ``` to ```bash to enable proper syntax highlighting and
consistency.

In `@explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md`:
- Around line 26-35: The fenced code block that lists the pipeline steps
(starting with "0.  Convert    raw bob / claude traces → normalized analysis
JSON") in SKILL.md is missing a language specifier; update the opening fence
from ``` to ```text so the block is marked as plain text and renders
consistently.

In `@explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md`:
- Around line 205-213: Update the fenced code block in SKILL.md that shows the
directory tree for "<wiki>/skills/" to include a language specifier (e.g.,
change the opening ``` to ```text) so the block is marked as plain text; locate
the block in the SKILL.md content that begins with the three backticks followed
by the tree and replace the opening fence accordingly to ensure consistent
formatting.

In `@explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md`:
- Around line 43-45: The fenced code block containing the line "Read
/tmp/summaries.json" is missing a language specifier; update the markdown in
SKILL.md by changing the opening fence from ``` to ```bash so the block reads as
a bash code block (i.e., use ```bash before the line and keep the closing ```),
ensuring consistency with other fenced blocks.

In `@explorations/agent-wiki/skills/scripts/_default_agents.md`:
- Around line 35-55: The fenced directory-structure block at the top of
AGENTS.md is missing a language tag; update the opening fence from ``` to
```text so the block is marked as plain text (modify the code block that begins
with the tree under <wiki-root>/ in AGENTS.md).
- Around line 123-131: The fenced provenance chain block in _default_agents.md
lacks a language specifier which fails markdown linting; update the opening
triple-backtick for that block to include a language (for example change ``` to
```text) so the block becomes a labeled text code fence, ensuring the block
around the lines starting with "guideline.md" and the following arrows is
annotated (look for the triple-backtick that opens that specific provenance
chain block).
- Around line 151-152: Update the example in _default_agents.md so the guideline
reference is unambiguous: replace or augment the bare ID `474bb2ba1076` with the
full atomic filename pattern `<slug>__474bb2ba1076.md` (or show the actual
example full filename), or add a short note stating that IDs shown (like
`474bb2ba1076`) are the gid field from `_index.jsonl` and map to atomic files
via the `<slug>__<gid>.md` convention; ensure this change is applied where the
example appears so readers can map the ID to the full guideline filename.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 03d507b1-88b4-4d95-926c-e74bcaf7cd25

📥 Commits

Reviewing files that changed from the base of the PR and between 6de3712 and 8cadfa6.

📒 Files selected for processing (34)
  • .pre-commit-config.yaml
  • .secrets.baseline
  • explorations/agent-wiki/README.md
  • explorations/agent-wiki/docs/design.md
  • explorations/agent-wiki/docs/schema.md
  • explorations/agent-wiki/experiments/RESULTS-SUMMARY.md
  • explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py
  • explorations/agent-wiki/experiments/harness/extract_trial_metrics.py
  • explorations/agent-wiki/experiments/harness/fiveway_compare.py
  • explorations/agent-wiki/experiments/harness/fourway_compare.py
  • explorations/agent-wiki/experiments/harness/normalize_stream_json_transcripts.py
  • explorations/agent-wiki/experiments/harness/threeway_compare.py
  • explorations/agent-wiki/experiments/harness/twobatch_compare.py
  • explorations/agent-wiki/experiments/harness/wiki_consult_tasks.yaml
  • explorations/agent-wiki/experiments/metrics/pruned-fixed-9atomic.metrics.jsonl
  • explorations/agent-wiki/experiments/metrics/twobatch-both.metrics.jsonl
  • explorations/agent-wiki/experiments/metrics/twobatch-skills.metrics.jsonl
  • explorations/agent-wiki/experiments/metrics/twobatch.metrics.jsonl
  • explorations/agent-wiki/experiments/pruned-index-hypothesis.md
  • explorations/agent-wiki/experiments/twobatch-comparison.md
  • explorations/agent-wiki/experiments/twobatch-fiveway-comparison.md
  • explorations/agent-wiki/experiments/twobatch-fourway-comparison.md
  • explorations/agent-wiki/experiments/twobatch-skills-comparison.md
  • explorations/agent-wiki/skills/agent-wiki-consolidate-guidelines/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-consult/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-extract-guidelines/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-ingest/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-summarize/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-synthesize-skill/SKILL.md
  • explorations/agent-wiki/skills/agent-wiki-tasks/SKILL.md
  • explorations/agent-wiki/skills/scripts/_default_agent_wiki_config.yaml
  • explorations/agent-wiki/skills/scripts/_default_agents.md
  • explorations/agent-wiki/skills/scripts/build_agent_wiki.py
  • pyproject.toml

Comment thread explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py Outdated
Comment thread explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py Outdated
Comment thread explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py Outdated
Comment thread explorations/agent-wiki/experiments/RESULTS-SUMMARY.md Outdated
Comment thread explorations/agent-wiki/skills/scripts/_default_agents.md
Comment thread explorations/agent-wiki/skills/scripts/_default_agents.md
vinodmut added 2 commits June 10, 2026 07:19
P1 — fresh catalog bootstrap crash: cmd_catalog now creates summaries/,
  guidelines/, tasks/, skills/ before any index writer runs. A `catalog`
  on a bare wiki-root no longer FileNotFounds on summaries/index.md.
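
The bootstrap fix above can be sketched as follows. This is a minimal illustration, not the code in build_agent_wiki.py: the four section names come from the commit message, while `ensure_sections` and `write_index` are hypothetical stand-ins for the real directory setup and index writers.

```python
from pathlib import Path
import tempfile

# Section names taken from the commit message; everything else is illustrative.
SECTIONS = ("summaries", "guidelines", "tasks", "skills")


def ensure_sections(wiki_root: Path) -> None:
    """Create every section directory up front so no index writer hits a missing parent."""
    for name in SECTIONS:
        (wiki_root / name).mkdir(parents=True, exist_ok=True)


def write_index(wiki_root: Path, section: str) -> Path:
    """Hypothetical index writer; raises FileNotFoundError if the section dir is absent."""
    index = wiki_root / section / "index.md"
    index.write_text(f"# {section}\n", encoding="utf-8")
    return index


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "bare-wiki"  # a wiki-root that does not exist yet
    ensure_sections(root)           # without this call, write_index would fail
    written = [write_index(root, s).name for s in SECTIONS]
```

Using `exist_ok=True` keeps the call idempotent, so running `catalog` on an already-populated wiki-root would be unaffected under this sketch.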

P1 — skill docs referenced non-existent paths: repointed all 23
  build_agent_wiki.py invocations and the normalizer reference from
  plugin-source/… and scripts/… to the in-tree
  explorations/agent-wiki/skills/scripts/ and …/experiments/harness/ paths
  (across the 7 skills + _default_agents.md).

P1 — harness reproducibility: experiment_wiki_consult.py is marked
  REFERENCE ONLY (it needs project-level sandbox assets — docker image,
  demo workspace, hint plugin, _format_samples — not shipped here); the
  tasks-file path now resolves to the checked-in harness/wiki_consult_tasks.yaml.
  README's "reproduce" wording split into re-runnable compare scripts vs the
  reference-only A/B runner.

P2 — render-cluster --archive-members broke member links: archive members
  BEFORE rendering the cluster page, and resolve each member to its real
  location — sibling in guidelines/, or ../_archived/<name>.md when archived.
  Links and titles now resolve in both modes.
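
One way to picture the member-link resolution described above is shown here. It is a sketch under assumptions: `member_link` is a hypothetical helper, the file names are made up, and `_archived/` is assumed to sit next to `guidelines/` under the wiki root, as the `../_archived/<name>.md` form suggests.

```python
from pathlib import Path
import tempfile


def member_link(wiki_root: Path, name: str) -> str:
    """Return a link, relative to guidelines/, for a cluster member file name."""
    if (wiki_root / "guidelines" / name).exists():
        return name                    # still a sibling of the cluster page
    if (wiki_root / "_archived" / name).exists():
        return f"../_archived/{name}"  # moved out by archiving
    raise FileNotFoundError(name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "guidelines").mkdir()
    (root / "_archived").mkdir()
    # Hypothetical members: one kept in place, one already archived.
    (root / "guidelines" / "a__111.md").write_text("kept\n", encoding="utf-8")
    (root / "_archived" / "b__222.md").write_text("archived\n", encoding="utf-8")
    links = [member_link(root, n) for n in ("a__111.md", "b__222.md")]
```

Because the lookup checks where each file actually is, it only gives correct links if archiving has already happened, which is why the ordering in the commit message matters.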

P2 — README described moved-out wikis: the example wikis live in the
  companion PR; README layout/reading-order/scope updated accordingly.

Also: stripped trailing EOF blank lines in twobatch-comparison.md and
twobatch-skills-comparison.md (git diff --check).

CodeRabbit re-reviewed the focused (code-only) PR and flagged 7 items; 3 were
already fixed by the prior commit (REPO_ROOT, tasks_file path, build-script
path — CodeRabbit confirmed resolved). The remaining 4:

- [major] _format_samples import: wrap the deferred import in a clear
  RuntimeError explaining it's a project-level sandbox asset absent from this
  reference-only runner, instead of a bare ImportError.
- [minor] median was durs[n//2] — wrong for even-length trial lists; now
  averages the two middle values for even n (default --trials 3 unaffected).
- [minor] typo "byes" -> "bytes" in RESULTS-SUMMARY.md.
- [minor] _default_agents.md Structure tree: add the per-section index.md
  entries (summaries/guidelines/skills/tasks) the catalog regenerates.
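
The median fix in the list above can be illustrated with a short sketch. The function name and sample durations are illustrative, and the list is sorted inside the helper on the assumption that the harness sorts `durs` before indexing.

```python
import statistics


def median(durs: list[float]) -> float:
    """Median of a non-empty list; averages the two middle values when the length is even."""
    ordered = sorted(durs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return float(ordered[mid])
    return (ordered[mid - 1] + ordered[mid]) / 2.0


odd = median([3.0, 1.0, 2.0])         # three trials: the middle value
even = median([4.0, 1.0, 3.0, 2.0])   # four trials: mean of the two middle values
naive_even = sorted([4.0, 1.0, 3.0, 2.0])[4 // 2]  # what durs[n // 2] would pick
```

For four trials the naive index picks the upper middle value, while the corrected form matches `statistics.median` from the standard library, which could also be used directly.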
@vinodmut

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@visahak

visahak commented Jun 10, 2026

Collaborator

Summary

This PR adds the agent-wiki exploration: reference skills, a deterministic wiki builder, experiment reports, metric rollups, and harness scripts for evaluating trajectory-derived wiki/skill
artifacts. It does not change core altk_evolve package behavior, and the branch is current with upstream/main (6de3712) and PR head (26c2884).

Findings

  1. Reference A/B runner resolves project paths under explorations/agent-wiki instead of the repo root (confidence: 100/100)
    • Why it matters: the runner documents usage “within the full project,” but it cannot find project-level assets such as demo/workspace, platform-integrations, or tests/e2e/_wiki_hint_plugin,
      so the reference reproduction path fails before running a sandbox trial.
    • Evidence: explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py:36 documents running from the full project, but REPO_ROOT = Path(__file__).resolve().parents[2] at line 57
      resolves to explorations/agent-wiki. DEMO_WORKSPACE is then built from that wrong root at line 77.
    • Repro: running with a fake wiki and /private/tmp output fails with FileNotFoundError: .../explorations/agent-wiki/demo/workspace. The real repo asset exists at demo/workspace.
  2. Secret scanning is disabled for the entire explorations/agent-wiki tree (confidence: 90/100)
    • Why it matters: the comment says the false positives come from generated wiki content and schema/example IDs, but the hook now skips all hand-written code, docs, harness scripts, and future files
      under explorations/agent-wiki. That weakens the repo’s secret-prevention gate more broadly than needed.
    • Evidence: .pre-commit-config.yaml:52-55 excludes ^explorations/agent-wiki/, and .secrets.baseline:2-4 repeats the same broad exclude. Ruff/mypy already use the narrower generated-output path
      explorations/agent-wiki/wikis/ in pyproject.toml:109-112 and pyproject.toml:167-172
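
Finding 1 comes down to how `Path.parents` is indexed. The sketch below uses a pure path so it runs anywhere; the `/repo` checkout location is an assumption made for illustration, while the rest of the path follows the file layout in this PR.

```python
from pathlib import PurePosixPath

# Hypothetical checkout at /repo; the relative layout matches the PR.
script = PurePosixPath(
    "/repo/explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py"
)

# parents[0] is the containing directory, and each further index climbs one level.
too_shallow = script.parents[2]   # explorations/agent-wiki
repo_root = script.parents[4]     # the checkout root

demo_workspace_wrong = too_shallow / "demo" / "workspace"
demo_workspace_right = repo_root / "demo" / "workspace"
```

With `parents[2]`, the derived `demo/workspace` path lands under explorations/agent-wiki, which matches the FileNotFoundError in the repro.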

Testing

  • uv run pytest -m e2e -v with .env loaded: 3 failed, 196 passed, 593 deselected, 1 warning.
  • Failure details: the failures appear unrelated to PR 268:
    • tests/e2e/test_claude_sandbox_learn_recall.py::test_claude_learn_then_recall_flow: model/sandbox behavior, Claude invoked exiftool despite recall.
    • tests/e2e/test_codex_sandbox_learn_recall.py::test_codex_learn_then_recall_flow: OpenAI/Codex auth failed with 401 Unauthorized.
    • tests/platform_integrations/test_subscribe.py::TestSubscribe::test_warns_when_audit_write_fails: existing audit-log permission behavior mismatch.
  • Additional validation: git diff --check passed; ruff check . --no-fix passed; mypy . passed; py-compile of new scripts passed; builder help/smoke tests passed; comparison scripts passed when
    run from their harness directory.

vinodmut added 2 commits June 10, 2026 09:30
1. Harness REPO_ROOT resolved to explorations/agent-wiki (parents[2]) instead
   of the repo root, so the reference A/B runner couldn't find project assets
   (demo/workspace, platform-integrations/, tests/e2e/_wiki_hint_plugin). The
   script moved from tests/e2e/ (where parents[2] was the root) down two levels
   to experiments/harness/; REPO_ROOT is now parents[4] (the real repo root),
   matching the documented "run from the full project" usage.

2. detect-secrets exclude was over-broad (^explorations/agent-wiki/), disabling
   the secret gate over all hand-written code/docs/harness there. Narrowed to
   only the generated example-wiki tree and the schema doc's worked examples
   (^explorations/agent-wiki/wikis/ + docs/schema.md) — the only paths whose
   12-hex content hashes / session UUIDs trip the high-entropy detector. This
   mirrors the ruff/mypy scoping (wikis/ only). Applied in both
   .pre-commit-config.yaml and .secrets.baseline.

Accidentally added in the prior commit by `git add -A`; this is a local
review-notes scratch file, not part of the exploration.
@vinodmut

Contributor Author

@visahak thanks — both addressed in d0e0850:

  1. REPO_ROOT — you're right, parents[2] resolved to explorations/agent-wiki (the script moved down two levels from its original tests/e2e/ home, where parents[2] was the root). Fixed to parents[4] so the reference runner finds demo/workspace, platform-integrations/, and tests/e2e/_wiki_hint_plugin when run from the full repo.

  2. detect-secrets exclude — narrowed from ^explorations/agent-wiki/ to just the generated tree + the schema doc's worked examples (wikis/ + docs/schema.md), so hand-written code/docs/harness stay scanned. Now matches the ruff/mypy scoping. Applied in both .pre-commit-config.yaml and .secrets.baseline.
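
The effect of the narrowing in item 2 can be checked with a small regex sketch. The broad pattern is the one quoted in the review; the combined narrow pattern is an assumed spelling for illustration and may not match the exact expression in .pre-commit-config.yaml, and the first sample path is a hypothetical generated page.

```python
import re

BROAD = re.compile(r"^explorations/agent-wiki/")
# Assumed spelling of the narrowed exclude; the real config may differ.
NARROW = re.compile(r"^explorations/agent-wiki/(wikis/|docs/schema\.md$)")

paths = [
    "explorations/agent-wiki/wikis/example/guidelines/index.md",  # hypothetical generated page
    "explorations/agent-wiki/docs/schema.md",
    "explorations/agent-wiki/experiments/harness/experiment_wiki_consult.py",
    "explorations/agent-wiki/skills/scripts/build_agent_wiki.py",
]

# A path that matches the exclude pattern is skipped by the scanner.
skipped_broad = [p for p in paths if BROAD.search(p)]
skipped_narrow = [p for p in paths if NARROW.search(p)]
```

Under the broad pattern all four paths are skipped; under the narrow one only the generated page and the schema doc are, so the harness and builder scripts stay scanned.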

On the e2e failures: agreed they're unrelated to this PR (sandbox/model behavior, Codex 401 auth, a pre-existing audit-log test) — this exploration doesn't touch the altk_evolve package.

experiment_wiki_consult.py rendered the summary footer with
runs_path.relative_to(REPO_ROOT) / transcripts_dir.relative_to(REPO_ROOT),
which raised ValueError at the very end of a run when --out-root pointed at
an absolute path outside the repo. Added a _display_path() helper that returns
the repo-relative form when the path is under REPO_ROOT and the absolute path
otherwise. In-repo out-roots still render relative; external ones no longer
crash.

(The other open finding in the review notes — over-broad detect-secrets
exclude — was already narrowed to wikis/ + docs/schema.md in d0e0850.)
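
A helper along the lines of the `_display_path()` described above might look like this. It is a sketch rather than the merged code: the `/repo` root and the sample paths are assumptions, and pure POSIX paths are used so the result does not depend on the host filesystem.

```python
from pathlib import PurePosixPath

REPO_ROOT = PurePosixPath("/repo")  # hypothetical checkout location


def _display_path(path: PurePosixPath, root: PurePosixPath = REPO_ROOT) -> str:
    """Repo-relative form when path is under root, otherwise the path unchanged."""
    try:
        return str(path.relative_to(root))
    except ValueError:
        # relative_to raises ValueError when path is not located under root.
        return str(path)


inside = _display_path(PurePosixPath("/repo/out/runs.jsonl"))
outside = _display_path(PurePosixPath("/private/tmp/out/runs.jsonl"))
```

Catching `ValueError` at the display step keeps the summary footer cosmetic: an external `--out-root` changes how the path is printed, not whether the run completes.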

@visahak visahak left a comment

Collaborator

LGTM.

@vinodmut vinodmut merged commit 3e26154 into AgentToolkit:main Jun 10, 2026
17 checks passed

2 participants